Preparing today’s CRAN package database

Download today’s CRAN database

library("cranly")
p_db <- tools::CRAN_package_db()

Next, we need to clean and organise the author names and the depends, imports, suggests and enhances fields

package_db <- clean_CRAN_db(p_db)
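To give a feel for what cleaning a dependency field can involve, here is a minimal base R sketch. It is not how clean_CRAN_db is implemented; the helper parse_dependency_field is hypothetical, and it assumes the raw field is a single comma-separated string that may carry version constraints in parentheses.

```r
# Hypothetical helper (not part of cranly): split a raw dependency field
# into bare package names.
parse_dependency_field <- function(field) {
  # Missing or empty fields yield no dependencies
  if (is.na(field) || !nzchar(field)) {
    return(character(0))
  }
  # Entries are assumed to be comma-separated
  entries <- strsplit(field, ",", fixed = TRUE)[[1]]
  # Drop version constraints such as "(>= 3.4.0)"
  entries <- gsub("\\([^)]*\\)", "", entries)
  # Trim surrounding whitespace (including newlines) and drop empty entries
  entries <- trimws(entries)
  entries <- entries[nzchar(entries)]
  # "R" itself is not a package, so remove it
  setdiff(entries, "R")
}

parse_dependency_field("R (>= 3.4.0), stats,\n igraph (>= 1.0), ggplot2")
#> [1] "stats"   "igraph"  "ggplot2"
```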

The resulting dataset carries a timestamp recording when it was put together. This helps keep track of when the data import took place, and will be useful in future versions when dynamic analyses and visualisation methods are implemented.

attr(package_db, "timestamp")
#> [1] "2018-03-23 10:56:57 GMT"
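
As an illustration of how such a timestamp can be used, the sketch below attaches one to a toy data frame and checks how old the snapshot is. The object toy_db and the seven-day threshold are assumptions made for this example only; they are not part of cranly.

```r
# Toy stand-in for a cleaned package database (made-up package names)
toy_db <- data.frame(package = c("pkgA", "pkgB"), stringsAsFactors = FALSE)

# Record when the snapshot was put together
attr(toy_db, "timestamp") <- Sys.time()

# Later on: retrieve the timestamp and compute the age of the snapshot in days
import_time <- attr(toy_db, "timestamp")
age_days <- as.numeric(difftime(Sys.time(), import_time, units = "days"))

# Flag snapshots older than an (arbitrary) threshold of 7 days
is_stale <- age_days > 7
```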

Network of package directives

We can now extract edges and nodes for the CRAN package directives network by simply doing

package_network <- build_network(object = package_db)
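
Conceptually, the package network has one node per package and one directed edge per directive. The base R sketch below builds such an edge list from a toy database; the package names, the list layout and the column names from, to and type are all made up for illustration and need not match what build_network produces.

```r
# Toy database: each package lists what it imports and suggests
toy_db <- list(
  pkgA = list(imports = c("pkgB", "pkgC"), suggests = "pkgD"),
  pkgB = list(imports = "pkgC",            suggests = character(0)),
  pkgC = list(imports = character(0),      suggests = character(0)),
  pkgD = list(imports = "pkgC",            suggests = "pkgA")
)

# Collect one (from, to, type) triple per directive
from <- character(0)
to <- character(0)
type <- character(0)
for (pkg in names(toy_db)) {
  for (directive in names(toy_db[[pkg]])) {
    targets <- toy_db[[pkg]][[directive]]
    from <- c(from, rep(pkg, length(targets)))
    to <- c(to, targets)
    type <- c(type, rep(directive, length(targets)))
  }
}

edges <- data.frame(from, to, type, stringsAsFactors = FALSE)
nodes <- data.frame(package = names(toy_db), stringsAsFactors = FALSE)
```

Here edges has six rows, one for each directive in toy_db, and nodes has one row per package.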

and compute various statistics for the package network

## Global package network statistics
package_summaries <- summary(package_network)

The package_summaries object can now be used for finding the top-20 packages according to various statistics

plot(package_summaries, according_to = "n_imported_by", top = 20)

plot(package_summaries, according_to = "page_rank", top = 20)

plot(package_summaries, according_to = "betweenness", top = 20)

plot(package_summaries, according_to = "n_enhances", top = 20)

plot(package_summaries, according_to = "n_authors", top = 20)

plot(package_summaries, according_to = "n_imports", top = 20)
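
To make statistics such as n_imported_by and n_imports concrete, the sketch below computes them by hand for a toy set of import edges. The data are made up, and this is only meant to show what the counts mean, not how summary computes them.

```r
# Toy import edges: `from` imports `to` (made-up package names)
imports <- data.frame(
  from = c("pkgA", "pkgA", "pkgB", "pkgD"),
  to   = c("pkgB", "pkgC", "pkgC", "pkgC"),
  stringsAsFactors = FALSE
)
packages <- sort(unique(c(imports$from, imports$to)))

# In-degree: how many packages import each package
n_imported_by <- table(factor(imports$to, levels = packages))

# Out-degree: how many packages each package imports
n_imports <- table(factor(imports$from, levels = packages))

# The most imported package in the toy network
top <- head(sort(n_imported_by, decreasing = TRUE), 1)
names(top)
#> [1] "pkgC"
```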

The sub-network for my packages can be found using the extractor function package_by, which uses exact matching by default

my_packages <- package_by(package_network, "Ioannis Kosmidis")
my_packages
#> [1] "betareg"      "brglm"        "brglm2"       "enrichwith"  
#> [5] "PlackettLuce" "profileModel" "trackeR"
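
The difference between exact and partial matching of author names can be illustrated with a small base R sketch. The function packages_by_author and the toy author lists are hypothetical and only mimic the idea; they are not the cranly implementation.

```r
# Toy mapping from packages to their authors (made-up names)
toy_authors <- list(
  pkgA = c("Ioannis Kosmidis", "Jane Doe"),
  pkgB = c("John Kosmidis-Smith"),
  pkgC = c("Jane Doe")
)

# Hypothetical extractor: packages with an author matching `name`
packages_by_author <- function(authors, name, exact = TRUE) {
  hit <- vapply(authors, function(a) {
    if (exact) {
      # Exact matching: the full author name must be identical
      name %in% a
    } else {
      # Partial matching: `name` may appear anywhere in an author name
      any(grepl(name, a, fixed = TRUE))
    }
  }, logical(1))
  names(authors)[hit]
}

packages_by_author(toy_authors, "Kosmidis")
#> character(0)
packages_by_author(toy_authors, "Kosmidis", exact = FALSE)
#> [1] "pkgA" "pkgB"
packages_by_author(toy_authors, "Ioannis Kosmidis")
#> [1] "pkgA"
```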

We can now get an interactive visualisation of the sub-network for my packages using

visualize(package_network, package = my_packages, title = TRUE, legend = TRUE)

You can hover over the nodes and the edges to get package-specific information and links to the package pages.

CRAN collaboration network

Next let’s build the CRAN collaboration network

author_network <- build_network(object = package_db, perspective = "author")
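
In a collaboration network the nodes are authors, and two authors are linked when they share at least one package. The sketch below builds such links from a toy list of package authors using combn; the author names and the from, to and package columns are assumptions for illustration, not the structure returned by build_network.

```r
# Toy mapping from packages to their authors (made-up names)
toy_authors <- list(
  pkgA = c("Author 1", "Author 2", "Author 3"),
  pkgB = c("Author 2", "Author 3"),
  pkgC = c("Author 4")
)

# Only packages with at least two distinct authors produce links
multi_author <- Filter(function(a) length(unique(a)) >= 2, toy_authors)

# One row per pair of co-authors per package
collaborations <- do.call(rbind, lapply(names(multi_author), function(pkg) {
  a <- sort(unique(multi_author[[pkg]]))
  pairs <- t(combn(a, 2))
  data.frame(from = pairs[, 1], to = pairs[, 2], package = pkg,
             stringsAsFactors = FALSE)
}))
```

In this toy example collaborations has four rows: three pairs from pkgA and one from pkgB. "Author 4" has no co-authors and so appears in no link.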

Statistics for the collaboration network can be computed using the summary method as we did for package directives.

author_summaries <- summary(author_network)

The top-20 collaborators according to various network statistics are

plot(author_summaries, according_to = "n_packages", top = 20)

plot(author_summaries, according_to = "page_rank", top = 20)

plot(author_summaries, according_to = "betweenness", top = 20)
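
For readers unfamiliar with the page rank statistic, the sketch below is a textbook power iteration on a toy collaboration graph. It is an illustration under stated assumptions (damping factor 0.85, uniform redistribution for isolated nodes, made-up authors and links) and is not necessarily how the summaries above are computed.

```r
# Textbook PageRank by power iteration on an adjacency matrix
page_rank_sketch <- function(adj, damping = 0.85, tol = 1e-10, max_iter = 1000) {
  n <- nrow(adj)
  degree <- rowSums(adj)
  # Row-normalise the adjacency matrix into transition probabilities
  transition <- adj / ifelse(degree > 0, degree, 1)
  # Nodes without links spread their rank uniformly over all nodes
  transition[degree == 0, ] <- 1 / n
  rank <- rep(1 / n, n)
  for (i in seq_len(max_iter)) {
    new_rank <- (1 - damping) / n + damping * as.vector(rank %*% transition)
    converged <- sum(abs(new_rank - rank)) < tol
    rank <- new_rank
    if (converged) break
  }
  setNames(rank, rownames(adj))
}

# Toy undirected collaboration graph: links 1-2, 1-3, 2-3 and 3-4
authors <- paste("Author", 1:4)
adj <- matrix(0, 4, 4, dimnames = list(authors, authors))
links <- rbind(c(1, 2), c(1, 3), c(2, 3), c(3, 4))
adj[links] <- 1
adj[links[, 2:1]] <- 1

pr <- page_rank_sketch(adj)
names(which.max(pr))
#> [1] "Author 3"
```

"Author 3" collaborates with everyone else in the toy graph, so it receives the largest score.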

R Core’s collaboration sub-network is

visualize(author_network, author = "R Core")

Brian Ripley’s collaboration sub-network is

visualize(author_network, author = "Brian Ripley")

and my (small but valuable to me!) collaboration sub-network is

visualize(author_network, author = "Kosmidis")